954 results for Models, Statistical


Relevance:

80.00%

Publisher:

Abstract:

Federal Highway Administration, Office of Safety and Traffic Operations, Washington, D.C.

Relevance:

70.00%

Publisher:

Abstract:

The intent of this note is to succinctly articulate additional points that were not provided in the original paper (Lord et al., 2005) and to help clarify a collective reluctance to adopt zero-inflated (ZI) models for modeling highway safety data. A dialogue on this important issue, just one of many important safety modeling issues, is healthy discourse on the path towards improved safety modeling. This note first provides a summary of prior findings and conclusions of the original paper. It then presents two critical and relevant issues: the maximizing statistical fit fallacy and logic problems with the ZI model in highway safety modeling. Finally, we provide brief conclusions.
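For readers unfamiliar with the model class under discussion, the sketch below shows the standard zero-inflated Poisson probability mass function, which mixes a point mass at zero with an ordinary Poisson count. It is not taken from the note or the original paper; the function name `zip_pmf` and the parameter values are assumptions chosen only for illustration.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing count k.

    lam: mean of the Poisson component (assumed > 0)
    pi:  probability of the structural "always zero" state (0 <= pi < 1)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from the structural state or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Hypothetical parameter values, for illustration only.
probs = [zip_pmf(k, lam=1.5, pi=0.3) for k in range(50)]
total = sum(probs)  # close to 1; the tail beyond k = 49 is negligible
```

With `pi = 0` the function reduces to the plain Poisson distribution, which makes the extra share of zeros attributed to the structural state easy to see.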

Relevance:

70.00%

Publisher:

Abstract:

Among the largest resources for biological sequence data is the large number of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts but, for technical reasons, they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error-prone coding regions have shown good performance in detecting and predicting these while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with codon-usage-bias-based error correction into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.

Relevance:

70.00%

Publisher:

Abstract:

A significant challenge in the prediction of climate change impacts on ecosystems and biodiversity is quantifying the sources of uncertainty that emerge within and between different models. Statistical species niche models have grown in popularity, yet no single best technique has been identified, reflecting differing performance in different situations. Our aim was to quantify uncertainties associated with the application of two complementary modelling techniques. Generalised linear mixed models (GLMM) and generalised additive mixed models (GAMM) were used to model the realised niche of ombrotrophic Sphagnum species in British peatlands. These models were then used to predict changes in Sphagnum cover between 2020 and 2050 based on projections of climate change and atmospheric deposition of nitrogen and sulphur. Over 90% of the variation in the GLMM predictions was due to niche model parameter uncertainty, dropping to 14% for the GAMM. After having covaried out other factors, average variation in predicted values of Sphagnum cover across UK peatlands was the next largest source of variation (8% for the GLMM and 86% for the GAMM). The better performance of the GAMM needs to be weighed against its tendency to overfit the training data. While our niche models are only a first approximation, we used them to undertake a preliminary evaluation of the relative importance of climate change and nitrogen and sulphur deposition and the geographic locations of the largest expected changes in Sphagnum cover. Predicted changes in cover were all small (generally <1% in an average 4 m² unit area) but also highly uncertain. Peatlands expected to be most affected by climate change in combination with atmospheric pollution were Dartmoor, Brecon Beacons and the western Lake District.

Relevance:

70.00%

Publisher:

Abstract:

Two taxonomies for the accurate classification of human and predicted exons were produced. Based on these taxonomies, important statistical properties of untranslated exons that are useful for improving automated gene-finding efforts were calculated. Finally, an important correlation between the energy and the information content in the human genome was identified.

Relevance:

70.00%

Publisher:

Abstract:

Many techniques used to model ecosystems cannot be meaningfully applied to large-scale ecological problems due to data constraints. Disparate collection methods, data types and incomplete data sets, or limited theoretical understanding mean that a wide range of modelling techniques used to model physical processes or for problems specific to species or populations cannot be used at an ecosystem scale. In developing an ecological response model for the Coorong, a South Australian hypersaline estuary, we combined several flexible modelling approaches in a statistical framework to develop an approach we call ‘ecosystem states’. This model uses simulated hydrodynamic conditions as input to predict one of a suite of states per space and time, allowing prediction of likely ecological conditions under a variety of scenarios. Each ecosystem state has defined sets of biota and physico-chemical parameters. The existing model is limited in that its predictions have yet to be tested and, as yet, no spatial or temporal connectivity has been incorporated into simulated time series of ecosystem states. This approach can be used in a wide range of ecosystems, where enough data are available to model ecosystem states. We are in the process of applying the technique to a nearby lake system. This has been more difficult than for the Coorong as there is little overlap in the spatial and temporal coverage of biological data sets for that region. The approach is robust to low-quality biological data and missing environmental data, so should suit situations where community or management monitoring programs have occurred through time.

Relevance:

70.00%

Publisher:

Abstract:

We investigate feature stability in the context of clinical prognosis derived from high-dimensional electronic medical records. To reduce variance in the selected features that are predictive, we introduce Laplacian-based regularization into a regression model. The Laplacian is derived on a feature graph that captures both the temporal and hierarchic relations between hospital events, diseases, and interventions. Using a cohort of patients with heart failure, we demonstrate better feature stability and goodness-of-fit through feature graph stabilization.
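As a rough illustration of what a Laplacian-based penalty does, the sketch below builds the Laplacian L = D - A of a small undirected feature graph and evaluates the quadratic form wᵀLw, which equals the sum of squared differences between coefficients of connected features. This is a generic textbook construction, not the authors' implementation; the three-feature chain graph, the unweighted edges, and the function names are assumptions.

```python
def graph_laplacian(n, edges):
    """Return L = D - A for an undirected, unweighted graph on n nodes.

    Assumes each edge appears once and there are no self-loops.
    """
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0  # degree of node i
        L[j][j] += 1.0  # degree of node j
        L[i][j] -= 1.0  # adjacency entries, negated
        L[j][i] -= 1.0
    return L


def laplacian_penalty(w, L):
    """Quadratic form w^T L w, i.e. the sum of (w_i - w_j)^2 over edges."""
    n = len(w)
    return sum(w[i] * L[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical feature graph: a chain linking feature 0 - 1 - 2.
edges = [(0, 1), (1, 2)]
L = graph_laplacian(3, edges)

smooth = laplacian_penalty([1.0, 1.0, 1.0], L)   # equal weights on linked features
rough = laplacian_penalty([1.0, -1.0, 1.0], L)   # linked features disagree
```

Adding a multiple of this penalty to a regression loss discourages connected features from receiving very different coefficients, which is one way such a term can reduce variance in the selected features.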

Relevance:

70.00%

Publisher:

Abstract:

Studies investigating the use of random regression models for genetic evaluation of milk production in Zebu cattle are scarce. In this study, 59,744 test-day milk yield records from 7,810 first lactations of purebred dairy Gyr (Bos indicus) and crossbred (dairy Gyr × Holstein) cows were used to compare random regression models in which additive genetic and permanent environmental effects were modeled using orthogonal Legendre polynomials or linear spline functions. Residual variances were modeled considering 1, 5, or 10 classes of days in milk. Five classes fitted the changes in residual variances over the lactation adequately and were used for model comparison. The model that fitted linear spline functions with 6 knots provided the lowest sum of residual variances across lactation. On the other hand, according to the deviance information criterion (DIC) and Bayesian information criterion (BIC), a model using third-order and fourth-order Legendre polynomials for additive genetic and permanent environmental effects, respectively, provided the best fit. However, the high rank correlation (0.998) between this model and the one applying third-order Legendre polynomials for both additive genetic and permanent environmental effects indicates that, in practice, the same bulls would be selected by both models. The latter model, which is less parameterized, is a parsimonious option for fitting dairy Gyr breed test-day milk yield records. © 2013 American Dairy Science Association.
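To make the Legendre polynomial covariates concrete, the sketch below standardizes days in milk to the interval [-1, 1] and evaluates the polynomials with Bonnet's recurrence. It is a minimal illustration, not the study's code: the 5 to 305 day range is a hypothetical choice, `order` here means the highest polynomial degree (usage of "order" varies in the literature), and the polynomials are left unnormalized.

```python
def standardize_dim(dim, dim_min, dim_max):
    """Map days in milk from [dim_min, dim_max] onto [-1, 1]."""
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


# Hypothetical lactation window of 5 to 305 days in milk.
x_mid = standardize_dim(155, 5, 305)   # midpoint of the window maps to 0.0
row_mid = legendre_basis(x_mid, 3)     # covariates for one test-day record
row_half = legendre_basis(0.5, 3)
```

Each test-day record contributes one such row of covariates, and the random regression coefficients for an animal multiply that row to give its deviation at that stage of lactation.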

Relevance:

70.00%

Publisher:

Abstract:

Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments in efficient algorithms for implementing these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples, and their validity for cases with practical sample sizes is demonstrated via a simulation study.